Qwen3-Omni Is Coming: Cross-modal Model on the Edge Gets an Upgrade; PR Submitted to the Transformers Library
Alibaba's Qwen3-Omni, a new cross-modal model, is set for release. It supports multiple input/output modalities (text, image, audio, video) and features a Thinker-Talker architecture for efficient deployment.